Compression and an IR Approach to XML Retrieval

Authors

  • Vo Ngoc Anh
  • Alistair Moffat
Abstract

A two-phase evaluation scheme is proposed for XML retrieval. In the first phase, a modified vector space model is employed to obtain similarity scores for the textual nodes of XML trees. In the second phase, the scores are propagated upward in the XML trees, with scores of the textual nodes being modified and scores of other nodes being generated. As a result, while a vector space ranking is used, the final scores computed do not truly reflect the vector space scores. In addition to the two-phase evaluation, an integrated compressed file system is proposed for both storing and retrieving XML documents. This leads to an efficient representation of XML repositories.
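
The two-phase idea can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state the modified vector space model or the propagation rule, so the code below assumes a plain TF-IDF cosine similarity for phase one and a decayed sum of child scores for phase two, and it leaves the leaf scores unmodified. The names `Node`, `phase_one`, `phase_two`, and the `decay` parameter are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical XML tree node: leaves carry text, inner nodes carry children."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)
    score: float = 0.0


def leaves(node):
    """Yield every textual (leaf) node in the subtree rooted at `node`."""
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)


def norm(vec):
    """Euclidean length of a sparse term-weight vector."""
    return math.sqrt(sum(w * w for w in vec.values()))


def phase_one(root, query):
    """Phase 1 (assumed model): TF-IDF cosine similarity for each textual node."""
    docs = list(leaves(root))
    bags = [Counter(d.text.lower().split()) for d in docs]
    # Document frequency: in how many textual nodes each term occurs.
    df = Counter(term for bag in bags for term in bag)
    idf = {t: math.log(1 + len(docs) / df[t]) for t in df}
    q_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(query.lower().split()).items()}
    q_norm = norm(q_vec)
    for d, bag in zip(docs, bags):
        d_vec = {t: c * idf[t] for t, c in bag.items()}
        d_norm = norm(d_vec)
        dot = sum(w * d_vec.get(t, 0.0) for t, w in q_vec.items())
        d.score = dot / (q_norm * d_norm) if q_norm and d_norm else 0.0


def phase_two(node, decay=0.5):
    """Phase 2 (assumed rule): an inner node gets the decayed sum of its children."""
    if node.children:
        node.score = decay * sum(phase_two(c, decay) for c in node.children)
    return node.score
```

Calling `phase_one(root, query)` and then `phase_two(root)` leaves a score on every node of the tree. Because inner-node scores are derived rather than computed from term vectors, they are no longer cosine similarities, which matches the abstract's remark that the final scores do not truly reflect the vector space scores.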

Similar resources

Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica

Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...

XML Information Retrieval: An overview

Locating and distilling the valuable relevant information continued to be the major challenges of Information Retrieval (IR) Systems owing to the explosive growth of online web information. These challenges can be considered the XML Information Retrieval challenges as XML has become a de facto standard over the Web. The research on XML IR starts with the classical IR strategies customized to X...

Vague Content and Structure (VCAS) Retrieval for Data-Centric XML Collection

Retrieving information from EHRs that are represented as XML documents is an important aspect for the users of this domain. Such retrieving may lead to some vague queries. There is an extensive need for designing an IR approach to process these kinds of queries and retrieve relevant XML documents. This paper proposes an IR-approach that decomposes the vague query into two sub-queries, Content-O...

Feedback-Driven Structural Query Expansion for Ranked Retrieval of XML Data

Relevance Feedback is an important way to enhance retrieval quality by integrating relevance information provided by a user. In XML retrieval, feedback engines usually generate an expanded query from the content of elements marked as relevant or nonrelevant. This approach that is inspired by text-based IR completely ignores the semistructured nature of XML. This paper makes the important step f...

Processing Content-And-Structure Queries for XML Retrieval

Document-centric XML collections contain text-rich documents, marked up with XML tags. The tags add lightweight semantics to the text. Querying such collections calls for a hybrid query language: the text-rich nature of the documents suggest a content-oriented (IR) approach, while the mark-up allows users to add structural constraints to their IR queries. We propose an approach to such hybrid c...

Publication date: 2002